Large Dataset Compression Approach Using Intelligent Technique
Authors
Abstract
Data clustering is the process of putting similar data into groups. A clustering algorithm partitions a data set into several groups such that the similarity within a group is larger than the similarity among groups. Association rule mining is one possible method for analyzing data. Association rule algorithms generate a huge number of rules, many of which are redundant. The main idea of this paper is to compress a large database by combining clustering techniques with association rule algorithms. In the first stage, the database is compressed using clustering techniques; an association rule algorithm is then applied. An adaptive k-means clustering algorithm is proposed together with the Apriori algorithm. Experiments show that using the adaptive k-means algorithm and the Apriori algorithm together gives a better compression ratio and a smaller compressed file size than using either algorithm alone. Several experiments were run on databases of several different sizes. The Apriori algorithm increases the compression ratio of the adaptive k-means algorithm when they are used together, but it takes more compression time than the adaptive k-means algorithm does. These algorithms are presented and their results are compared.
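As an illustration of the Apriori step named in the abstract, the following is a minimal sketch of frequent-itemset mining. It is not the authors' implementation, which this page does not describe; the function names `apriori` and `compression_ratio`, the `min_support` parameter, and the toy transactions are all assumptions made for the example. The ratio helper assumes the convention original size divided by compressed size, so that a larger value means better compression.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    Hypothetical helper for illustration; min_support is an absolute count.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count how many transactions contain each single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        candidates = set()
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)
        # Count support of the surviving candidates against the data.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent


def compression_ratio(original_size, compressed_size):
    """Assumed convention: original size / compressed size."""
    return original_size / compressed_size


if __name__ == "__main__":
    # Toy transactions, invented for the example.
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"},
            {"b", "c"}, {"a", "b", "c"}]
    for itemset, support in sorted(apriori(data, 3).items(),
                                   key=lambda kv: (len(kv[0]), sorted(kv[0]))):
        print(sorted(itemset), support)
```

In a pipeline like the one the abstract outlines, such a miner would presumably run on data that has already been grouped by the clustering stage; how the mined rules are then used to shrink the stored file is not specified here and is left out of the sketch.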
Similar resources
Implementation of VLSI Based Image Compression Approach on Reconfigurable Computing System - A Survey
Image data require huge amounts of disk space and large bandwidths for transmission. Hence, image compression is necessary to reduce the amount of data required to represent a digital image. Therefore an efficient technique for image compression is in high demand. Although lots of compression techniques are available, the technique which is faster, memory efficient and simple, surely...
Intelligent scalable image watermarking robust against progressive DWT-based compression using genetic algorithms
Image watermarking refers to the process of embedding an authentication message, called watermark, into the host image to uniquely identify the ownership. In this paper a novel, intelligent, scalable, robust wavelet-based watermarking approach is proposed. The proposed approach employs a genetic algorithm to find nearly optimal positions to insert watermark. The embedding positions coded as chr...
Feature Extraction and Efficiency Comparison Using Dimension Reduction Methods in Sentiment Analysis Context
Nowadays, users can share their ideas and opinions with widespread access to the Internet and especially social networks. On the other hand, the analysis of people's feelings and ideas can play a significant role in the decision making of organizations and producers. Hence, sentiment analysis or opinion mining is an important field in natural language processing. One of the most common ways to ...
Diagnosis of Diabetes Using an Intelligent Approach Based on Bi-Level Dimensionality Reduction and Classification Algorithms
Objective: Diabetes is one of the most common metabolic diseases. Earlier diagnosis of diabetes and treatment of hyperglycemia and related metabolic abnormalities is of vital importance. Diagnosis of diabetes via proper interpretation of the diabetes data is an important classification problem. Classification systems help the clinicians to predict the risk factors that cause the diabetes or pre...
An Intelligent System’s Approach for Revitalization of Brown Fields using only Production Rate Data
State-of-the-art data analysis in production allows engineers to characterize reservoirs using production data. This saves companies large sums that would otherwise be spent on well testing and reservoir simulation and modeling. There are two shortcomings with today’s production data analysis: It needs bottom-hole or well-head pressure data in addition to data for rating reservoirs’ characteri...